A Generative Language Modeling Approach for Ranking Entities

Authors

  • Wouter Weerkamp
  • Krisztian Balog
  • Edgar Meij
Abstract

We describe our participation in the INEX 2008 Entity Ranking track. We develop a generative language modeling approach for the entity ranking and list completion tasks. Our framework comprises the following components: (i) entity and (ii) query language models, (iii) an entity prior, (iv) the probability of an entity for a given category, and (v) the probability of an entity given another entity. We explore various ways of estimating these components and report on our results. We find that improving the estimation of these components has a strongly positive effect on performance, yet there is room for further improvement.
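To make the role of components (i) to (iii) concrete, the following is a minimal sketch of generative ranking under assumptions that are not taken from the paper: each entity is represented by a bag of tokens, the entity model is smoothed against a collection model with Jelinek-Mercer interpolation, and entities are scored by the log prior plus the query log-likelihood. The function name `rank_entities`, the parameter `lam`, and the toy data are hypothetical; the category and entity-to-entity components (iv) and (v) are omitted because the abstract does not say how they are estimated.

```python
import math
from collections import Counter


def rank_entities(query, entity_docs, priors=None, lam=0.5):
    """Rank entities by log P(e) + sum over query terms of log P(t | entity model).

    query:       list of query tokens
    entity_docs: dict mapping entity name -> list of tokens describing it
    priors:      optional dict mapping entity name -> prior P(e); uniform if None
    lam:         weight of the collection model (assumed Jelinek-Mercer smoothing)
    """
    # Collection-wide term counts, used as the background model for smoothing.
    collection = Counter()
    for tokens in entity_docs.values():
        collection.update(tokens)
    coll_len = sum(collection.values())

    scores = {}
    for entity, tokens in entity_docs.items():
        tf = Counter(tokens)
        doc_len = len(tokens)
        prior = priors.get(entity, 1.0) if priors else 1.0
        score = math.log(prior)
        for term in query:
            p_doc = tf[term] / doc_len if doc_len else 0.0
            p_coll = collection[term] / coll_len if coll_len else 0.0
            p = (1 - lam) * p_doc + lam * p_coll
            # Terms unseen in the whole collection carry no evidence; skip them.
            if p > 0:
                score += math.log(p)
        scores[entity] = score

    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "amsterdam": ["city", "netherlands", "canal", "city"],
        "python": ["programming", "language"],
    }
    for entity, score in rank_entities(["city", "canal"], docs):
        print(entity, round(score, 3))
```

Working in log space avoids numerical underflow when queries are long, and a non-uniform `priors` dictionary shifts the ranking independently of the query terms, which is the role an entity prior plays in a framework of this kind.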

Similar resources

Invited Talk: Learning probability by comparison

Learning probability by probabilistic modeling is a major task in statistical machine learning and it has traditionally been supported by maximum likelihood estimation applied to generative models or by a local maximizer applied to discriminative models. In this talk, we introduce a third approach, an innovative one that learns probability by comparing probabilistic events. In our approach, we ...

LDA Based Similarity Modeling for Question Answering

We present an exploration of generative modeling for the question answering (QA) task to rank candidate passages. We investigate Latent Dirichlet Allocation (LDA) models to obtain ranking scores based on a novel similarity measure between a natural language question posed by the user and a candidate passage. We construct two models each one introducing deeper evaluations on latent characteristi...

A Topic Modeling Approach to Ranking

We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population of users. This new model also captures inconsistent user behavior in a natural way. We show how the estimation of latent rankings in the new generative model ...

LIA-iSmart at TREC 2010: An Unsupervised Web-Based Approach for Filtering Answers

Searching for named entities has been the subject of much research in information retrieval. Our goal in participating in the TREC 2010 Entity Ranking track is to recognize named entities in arbitrary categories and use this to rank candidate named entities. We propose to address the issue by means of a web-oriented language modeling approach.

On the Emergence of Scientific Grammar in Iran

Writing the grammar of a language is one of the most significant outputs of linguistic studies. In Iran, it is Avicenna (Ibn-e Sina) who is credited with the first such compilation of the Persian language. Understanding the weaknesses associated with the traditional trends of grammar writing in Iran, contemporary Iranian linguists adopted the modern Western approach following the Chomskyan Turn...

Publication date: 2008